WEBVTT 00:00.320 --> 00:03.840 Hi! My name is Chuan Yan and 00:03.840 --> 00:07.880 I’ll introduce our work FlatMagic: Improving Flat Colorization 00:07.880 --> 00:11.400 through AI-driven Design for Digital Comic Professionals. 00:11.400 --> 00:14.920 This work is a collaboration with Yotam 00:14.920 --> 00:20.480 Gingold and Ray Hong from George Mason University, John Chung and Eytan Adar 00:20.480 --> 00:23.480 from the University of Michigan, and Kiheon 00:23.480 --> 00:26.000 Yoon from Pusan National University. 00:27.480 --> 00:30.200 Digital drawing technology has greatly improved 00:30.640 --> 00:32.120 artists’ work efficiency. 00:32.120 --> 00:36.720 However, their workload may still not be easy, as industry standards have risen. 00:36.720 --> 00:41.760 For example, rapidly growing web comics have a much higher degree 00:41.760 --> 00:45.560 of colorization, which brings an even greater workload than before. 00:46.720 --> 00:51.200 On the other hand, there are also many research works that 00:51.200 --> 00:56.520 try to automate the colorization stage, and many of them show impressive results. 00:56.520 --> 01:01.400 However, few of those technologies have been widely used by professional artists. 01:01.400 --> 01:05.480 Therefore, to figure out the reason behind this, we interviewed five professionals 01:05.480 --> 01:08.840 who have published their work on commercial comic distribution 01:08.840 --> 01:13.320 platforms for more than three years. We aimed to understand 01:13.320 --> 01:16.440 their workflow, the challenges in their work, their general 01:16.440 --> 01:19.600 thoughts on and expected features of AI tools, 01:19.600 --> 01:24.360 and the key factors that affect their adoption of AI automation. 01:24.640 --> 01:29.040 The interview results show that there is a common workflow, which includes 01:29.040 --> 01:32.760 line drawing, flat colorization, shading, lighting, and special effects. 
01:32.760 --> 01:37.920 Additional analysis identified one special stage in this workflow, 01:37.920 --> 01:44.880 flatting, which shows several characteristics that make it particularly fit for AI-driven automation. 01:44.880 --> 01:49.240 We will discuss the reasoning behind this in the next few slides. 01:49.240 --> 01:53.080 First of all, what is flatting? Flatting is a colorization stage 01:53.080 --> 01:56.600 where the artists create a set of mutually exclusive layers. 01:56.600 --> 01:57.760 And there are 01:57.760 --> 02:01.360 two industry requirements: consistency and completeness. 02:01.360 --> 02:05.880 Flat consistency means the desired color regions are consistently correct: 02:05.880 --> 02:10.240 they should neither contain mergeable regions 02:10.240 --> 02:13.000 nor spill out into their neighboring regions. 02:13.000 --> 02:16.360 Therefore, professionals need to find and fix the falsely closed and falsely 02:16.360 --> 02:20.280 opened lines before they bucket fill, which requires them 02:20.280 --> 02:23.640 to pay full attention. 02:23.640 --> 02:27.520 Flat completeness requires that the flat result contain only 02:27.520 --> 02:28.400 mutually exclusive, 02:28.400 --> 02:33.400 that is, non-overlapping, regions that cover the whole drawing space. Common failure 02:33.400 --> 02:37.600 cases for this requirement are the “dirty bits” caused by anti-aliased 02:37.600 --> 02:42.160 pixels and gaps under line drawings when using naive bucket tools. 02:42.160 --> 02:46.520 And fixing those problems is usually tedious and time-consuming. 02:47.920 --> 02:52.000 On the other hand, participants also mentioned that “lack of control” is 02:52.000 --> 02:54.960 one of the issues that keep them from adopting AI tools, 02:54.960 --> 03:00.720 because the output of AI usually can’t exactly match the professional’s target, 03:00.720 --> 03:03.480 and it is difficult or even impossible to adjust. 
03:03.840 --> 03:07.920 Trying to fix those outputs can take even more effort than 03:07.920 --> 03:11.800 colorizing from scratch. It is also difficult for AI tools 03:11.800 --> 03:15.720 to find out the ideal work the professional has in mind. 03:15.720 --> 03:20.040 The result will be highly personalized, and there are always 03:20.040 --> 03:22.360 multiple targets for each single input. 03:23.640 --> 03:27.400 To sum up, we found that, unlike other colorization 03:27.400 --> 03:31.200 stages, flatting has only one “ground truth” for each line 03:31.200 --> 03:34.800 drawing. Meanwhile, it is also a bottleneck in the workflow 03:34.800 --> 03:38.400 because of its costly workload when using general drawing tools. 03:39.480 --> 03:43.080 So any speedup in this stage can help professionals invest 03:43.320 --> 03:47.280 more resources into later stages that matter more to the overall product 03:47.280 --> 03:51.120 quality. Therefore, they are more open to letting automated tools 03:51.120 --> 03:53.560 deal with this less creative stage. 03:54.480 --> 03:56.920 So, we designed FlatMagic. Given 03:56.920 --> 04:01.360 an input line drawing, how do we design an efficient flatting method? 04:01.360 --> 04:05.040 We have discussed that professionals usually face artifacts 04:05.040 --> 04:09.880 such as wrong flat regions and “dirty bits” from naive bucket tools. 04:10.880 --> 04:11.720 Therefore, it 04:11.720 --> 04:15.120 is natural to build the interface of FlatMagic 04:15.120 --> 04:18.960 around this bucketing interaction, to let users bucket freely 04:18.960 --> 04:25.520 and get a most correct result. Because flatting is color agnostic, 04:25.520 --> 04:28.480 we built a neural network to simplify the input lines. 04:28.480 --> 04:29.360 The target 04:29.360 --> 04:34.520 of the simplification is created by extracting a 1-pixel-wide boundary 04:34.520 --> 04:38.440 from the ideal flat result. We call this procedure 04:38.440 --> 04:42.520 neural re-line. 
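As a rough illustration of the re-line training target described above, the following Python sketch extracts a one-pixel-wide boundary from a flat label map. It is a toy under stated assumptions, not the FlatMagic implementation: the function name `boundary_mask`, the list-of-lists label format, and the rule of comparing each pixel only with its right and lower neighbors are all hypothetical choices made for this example.

```python
def boundary_mask(labels):
    """Return a 0/1 mask marking boundaries between flat regions.

    `labels` is a hypothetical list-of-lists label map in which every pixel
    holds the id of the flat region it belongs to. A pixel is marked when its
    right or lower neighbor carries a different id, so each transition between
    two regions is marked on one side only and stays one pixel thick.
    """
    height = len(labels)
    width = len(labels[0]) if height else 0
    mask = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            here = labels[r][c]
            differs_right = c + 1 < width and labels[r][c + 1] != here
            differs_down = r + 1 < height and labels[r + 1][c] != here
            if differs_right or differs_down:
                mask[r][c] = 1
    return mask


# Three flat regions (ids 1, 2, 3) on a tiny 3x4 canvas.
flat = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 2, 2],
]
for row in boundary_mask(flat):
    print(row)
```

Because the mask depends only on where region ids change and not on which colors the regions hold, it reflects the color-agnostic nature of flatting mentioned in the talk.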
Then we trained the neural re-line model on a ground 04:42.520 --> 04:45.720 truth dataset shared with us by a digital comic company. 04:45.720 --> 04:47.640 We then fill the simplified lines 04:47.640 --> 04:51.160 with different labels, and post-process the output into an initial flatting result. 04:51.440 --> 04:57.560 The user interface is implemented as an Adobe Photoshop plugin. 04:57.560 --> 05:00.320 It receives the result from the neural re-line 05:00.320 --> 05:03.800 model, and allows users to manually colorize it 05:03.800 --> 05:07.800 to the desired professional-level result with our improved bucketing tool. 05:09.000 --> 05:10.880 To verify the efficiency of FlatMagic, 05:10.880 --> 05:14.640 we recruited 16 students majoring in digital 05:15.840 --> 05:18.520 comics and animation, then asked them to flat 12 line 05:18.520 --> 05:23.360 drawings using their best practice and using FlatMagic, respectively. 05:24.440 --> 05:28.680 The flatting time distribution shows that using FlatMagic helped participants 05:28.680 --> 05:31.800 significantly reduce their labor. And the outcome 05:31.800 --> 05:35.640 quality measurement shows participants could achieve outcomes similar to or even slightly 05:35.640 --> 05:40.800 better than their best practice by using FlatMagic. 05:40.800 --> 05:44.440 So, here are some interesting high-level insights from our findings, 05:44.440 --> 05:47.520 which we call intermediate representations. 05:47.520 --> 05:49.760 Most AI automation 05:49.760 --> 05:54.120 methods provide full automation with no revisable checkpoints, which 05:54.120 --> 05:59.520 most professionals did not like; we call this AI over-scoping. 05:59.520 --> 06:03.240 In contrast, intermediate representations offer a modularized architecture 06:03.240 --> 06:06.480 and representation that make it possible for users 06:06.480 --> 06:08.520 to know where and how to make necessary corrections. 
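The step mentioned earlier, filling the simplified lines with different labels, can be pictured as a flood fill over the non-line pixels. The Python sketch below is only an illustration under assumptions: the function name `label_regions`, the 0/1 line map format, and the use of 4-connectivity are hypothetical, and the talk does not say how FlatMagic actually performs this step or its post-processing.

```python
from collections import deque


def label_regions(lines):
    """Give every region enclosed by line pixels its own positive label.

    `lines` is a hypothetical 0/1 map where 1 marks a line pixel. Line pixels
    keep label 0; each 4-connected group of non-line pixels gets a new id.
    Returns the label map and the number of regions found.
    """
    height = len(lines)
    width = len(lines[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    region_count = 0
    for r in range(height):
        for c in range(width):
            if lines[r][c] or labels[r][c]:
                continue  # line pixel, or already part of a labeled region
            region_count += 1
            labels[r][c] = region_count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and not lines[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = region_count
                        queue.append((ny, nx))
    return labels, region_count


closed = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]   # unbroken vertical line
opened = [[0, 1, 0], [0, 0, 0], [0, 1, 0]]   # same line with a one-pixel gap
print(label_regions(closed))
print(label_regions(opened))
```

The second input shows why a falsely opened line matters: a single missing line pixel lets the fill spill across, so two intended regions collapse into one.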
06:08.520 --> 06:13.360 So, we reflect on some possible decision points 06:13.360 --> 06:16.000 when considering the design of AI systems. 06:16.000 --> 06:22.680 First, we need to understand the detailed, stage-by-stage 06:22.680 --> 06:25.000 workflow of the professional’s task, then carefully evaluate the degree 06:25.000 --> 06:29.640 of AI usefulness and user intention at each stage in the workflow. 06:29.640 --> 06:34.400 AI usefulness reflects how much labor AI could cut down for the user. 06:34.400 --> 06:38.520 User intention means how much a user desires 06:38.520 --> 06:42.280 to have a particular step automated. When we consider 06:42.280 --> 06:46.480 AI usefulness and user intention as two different dimensions, 06:46.480 --> 06:50.360 we can project all the AI automation methods onto this strategic space 06:50.360 --> 06:55.240 to consider whether and how a tool may be adopted. 06:55.240 --> 07:01.080 The most promising case would be S1, where the AI is useful 07:01.080 --> 07:05.720 and its valuation is positive. In S2, where the AI is useful, 07:05.720 --> 07:09.520 we may still have a situation where an individual either doesn’t want, 07:09.520 --> 07:12.960 or doesn’t think they want, the automation. This suggests that 07:12.960 --> 07:17.360 it is worth considering better adoption strategies to convince a user to adopt it. 07:18.720 --> 07:22.320 In S3, the end user may want something automated, 07:22.320 --> 07:27.120 but the AI is unable to meet the objective. Introducing a problematic 07:27.120 --> 07:31.960 AI will discourage adoption of the current tool or even all other similar tools. 07:31.960 --> 07:35.960 This suggests that finding a smaller intermediate representation 07:35.960 --> 07:39.560 that gives a more reasonable AI automation scope may be needed. 
07:39.560 --> 07:45.720 In S4, the worst case, this may be a “back to the drawing board” 07:45.720 --> 07:50.680 situation. However, it may also be the result of not correctly modeling 07:50.680 --> 07:53.880 the needs, not building the right UI, or 07:53.880 --> 07:57.200 not implementing the algorithm well. 07:57.200 --> 08:00.960 After we decide on the right AI automation scope, we need to be careful 08:00.960 --> 08:05.040 about merging AI-driven automation scopes when building the tool, 08:05.040 --> 08:09.120 because merging multiple steps could ‘break’ a user’s mental model of the process. 08:09.120 --> 08:12.880 The user may not be able to identify how or why 08:12.880 --> 08:16.040 errors are occurring or how to correct them. 08:17.440 --> 08:20.160 Finally, I want to thank all my collaborators, 08:20.160 --> 08:25.000 and thank you for listening. Feel free to reach out to me if you have any questions.